Performance Analysis of Clustering Algorithms for Gene Expression Data

نویسندگان

T. Chandrasekhar

K. Thangavel

E. Elayaraja

چکیده

Microarray technology is a process that allows thousands of genes simultaneously monitor to various experimental conditions. It is used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins, This method is used to analysis the gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes, biologically relevant groupings of genes and samples. In this paper we analysed K-Means with Automatic Generations of Merge Factor for ISODATAAGMFI, to group the microarray data sets on the basic of ISODATA. AGMFI is to generate initial values for merge and Spilt factor, maximum merge times instead of selecting effic ient values as in ISODATA. The initial seeds for each cluster were normally chosen either sequentially or randomly. The quality of the final clusters was found to be influenced by these initial seeds. For the real life problems, the suitable number of clusters cannot be predicted. To overcome the above drawback the current research focused on developing the clustering algorithms without giving the initial number of clusters.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

متن کامل

خوشه‌بندی داده‌های بیان‌ژنی توسط عدم تشابه جنگل تصادفی

Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data are typically involve in a large number of variables (genes), in comparison with number of samples (patients). Many clustering methods have been built based on the dissimilarity among observations that are calculated by a distance function. As increa...

متن کامل

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...

متن کامل

به کارگیری روش‌های خوشه‌بندی در ریزآرایه DNA

Background: Microarray DNA technology has paved the way for investigators to expressed thousands of genes in a short time. Analysis of this big amount of raw data includes normalization, clustering and classification. The present study surveys the application of clustering technique in microarray DNA analysis. Materials and methods: We analyzed data of Van’t Veer et al study dealing with BRCA1...

متن کامل

Use of the Improved Frog-Leaping Algorithm in Data Clustering

Clustering is one of the known techniques in the field of data mining where data with similar properties is within the set of categories. K-means algorithm is one the simplest clustering algorithms which have disadvantages sensitive to initial values of the clusters and converging to the local optimum. In recent years, several algorithms are provided based on evolutionary algorithms for cluster...

متن کامل

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

CoRR

دوره abs/1307.3549 شماره

صفحات -

تاریخ انتشار 2012

Performance Analysis of Clustering Algorithms for Gene Expression Data

نویسندگان

چکیده

منابع مشابه

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

خوشه‌بندی داده‌های بیان‌ژنی توسط عدم تشابه جنگل تصادفی

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

به کارگیری روش‌های خوشه‌بندی در ریزآرایه DNA

Use of the Improved Frog-Leaping Algorithm in Data Clustering

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

عنوان ژورنال:

اشتراک گذاری